Efficient Segmental Cascades for Speech Recognition

نویسندگان

  • Hao Tang
  • Weiran Wang
  • Kevin Gimpel
  • Karen Livescu
چکیده

Discriminative segmental models offer a way to incorporate flexible feature functions into speech recognition. However, their appeal has been limited by their computational requirements, due to the large number of possible segments to consider. Multi-pass cascades of segmental models introduce features of increasing complexity in different passes, where in each pass a segmental model rescores lattices produced by a previous (simpler) segmental model. In this paper, we explore several ways of making segmental cascades efficient and practical: reducing the feature set in the first pass, frame subsampling, and various pruning approaches. In experiments on phonetic recognition, we find that with a combination of such techniques, it is possible to maintain competitive performance while greatly reducing decoding, pruning, and training time.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Sequence Prediction with Neural Segmental Models

Segments that span contiguous parts of inputs, such as phonemes in speech, named entities in sentences, actions in videos, occur frequently in sequence prediction problems. Segmental models, a class of models that explicitly hypothesizes segments, have allowed the exploration of rich segment features for sequence prediction. However, segmental models suffer from slow decoding, hampering the use...

متن کامل

Fusion of global statistical and segmental spectral features for speech emotion recognition

Speech emotion recognition is an interesting and challenging speech technology, which can be applied to broad areas. In this paper, we propose to fuse the global statistical and segmental spectral features at the decision level for speech emotion recognition. Each emotional utterance is individually scored by two recognition systems, the global statistics-based and segmental spectrum-based syst...

متن کامل

HMM composition of segmental unit input HMM for noisy speech recognition

For robust speech recognition in noisy environments, various methods have been studied. In this paper, we apply parallel model combination (PMC) for segmental unit input HMM to recognize corrupted speech in additive noise. Since several successive frames are combined and treated as an input vector in segmental unit input modeling, the increased dimension of vector degrades the precision in esti...

متن کامل

A recursive feature vector normalization approach for robust speech recognition in noise

The acoustic mismatch between testing and training conditions is known to severely degrade the performance of speech recognition systems. Segmental feature vector normalization [8] was found to improve the noise robustness of MFCC feature vectors and to outperform other state-of-the-art noise compensation techniques in speaker-dependent recognition. The objective of feature vector normalization...

متن کامل

Rhythm and Tempo Recognition of Music Performance from a Probabilistic Approach

This paper concerns both rhythm recognition and tempo analysis of expressive music performance based on a probabilistic approach. In rhythm recognition, the modern continuous speech recognition technique is applied to find the most likely intended note sequence from the given sequence of fluctuating note durations in the performance. Combining stochastic models of note durations deviating from ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016